Comprehensive Complexity Assessment of Emerging Learned Image Compression on CPU and GPU
Learned Compression (LC) is an emerging technology for compressing image and video content using deep neural networks. Despite being new, LC methods have already reached a compression efficiency comparable to state-of-the-art image compression such as HEVC, or even VVC. However, the existing solutions often incur a huge computational complexity, which discourages their adoption in international standards or products. This paper provides a comprehensive complexity assessment of several notable methods that sheds light on the matter and guides the future development of this field by presenting key findings. To do so, six existing methods have been evaluated for both encoding and decoding, on CPU and GPU platforms. Various aspects of complexity, such as the overall complexity, the share of each coding module, the number of operations, the number of parameters, the most demanding GPU kernels, and the memory requirements, have been measured and compared on the Kodak dataset. The reported results (1) quantify the complexity of LC methods, (2) fairly compare different methods, and (3) identify and quantify the key factors affecting the complexity.
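As a rough illustration of the kind of accounting such an assessment involves, the sketch below estimates parameter and multiply-accumulate (MAC) counts for a stack of convolutional layers. The layer shapes are hypothetical, chosen only to resemble a small analysis transform; they are not taken from any of the evaluated codecs.

```python
# Estimate parameters and MACs for a stack of conv layers (hypothetical
# shapes, for illustration only -- not any specific learned codec).

def conv_cost(in_ch, out_ch, k, out_h, out_w):
    """Parameters and MACs of one k x k convolution layer."""
    params = out_ch * (in_ch * k * k + 1)          # weights + biases
    macs = out_ch * in_ch * k * k * out_h * out_w  # one MAC per weight per output pixel
    return params, macs

# A toy 3-layer analysis transform on a 768x512 (Kodak-sized) image,
# downsampling by 2 at each layer.
layers = [
    (3,   128, 5, 384, 256),
    (128, 128, 5, 192, 128),
    (128, 192, 5,  96,  64),
]

total_params = total_macs = 0
for spec in layers:
    p, m = conv_cost(*spec)
    total_params += p
    total_macs += m

print(f"parameters: {total_params:,}")
print(f"MACs:       {total_macs / 1e9:.2f} G")
```

Counting operations and parameters this way complements wall-clock measurements, since it is independent of the platform the model runs on.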
Complexity Analysis Of Next-Generation VVC Encoding and Decoding
While the next-generation video compression standard, Versatile Video Coding (VVC), provides superior compression efficiency, its computational complexity increases dramatically. This paper thoroughly analyzes this complexity for both the encoder and decoder of VVC Test Model 6, by quantifying the complexity break-down of each coding tool and measuring the complexity and memory requirements of VVC encoding/decoding. These extensive analyses are performed for six video sequences of 720p, 1080p, and 2160p, under Low-Delay (LD), Random-Access (RA), and All-Intra (AI) conditions (a total of 320 encoding/decoding runs). Results indicate that the VVC encoder and decoder are 5x and 1.5x more complex than HEVC in LD, and 31x and 1.8x in AI, respectively. Detailed analysis of the coding tools reveals that, in LD on average, motion estimation tools with 53%, transformation and quantization with 22%, and entropy coding with 7% dominate the encoding complexity. In decoding, loop filters with 30%, motion compensation with 20%, and entropy decoding with 16% are the most complex modules. Moreover, the memory bandwidths required for VVC encoding/decoding are measured through memory profiling, and are 30x and 3x those of HEVC, respectively. The reported results and insights are a guide for future research and implementations of energy-efficient VVC encoders/decoders.
Comment: IEEE ICIP 202
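The per-tool break-down reported above can be reproduced in spirit from any set of per-module timings; the sketch below turns profiler-style measurements into percentage shares. The numbers are made up for illustration and are not the paper's.

```python
# Turn per-module timings (e.g. from a profiler) into percentage shares
# of total encoding time. The timings below are hypothetical.

def complexity_shares(timings):
    """Map module -> seconds to module -> % of total, sorted descending."""
    total = sum(timings.values())
    shares = {mod: 100.0 * t / total for mod, t in timings.items()}
    return dict(sorted(shares.items(), key=lambda kv: kv[1], reverse=True))

encoder_timings = {          # hypothetical seconds per module
    "motion estimation": 530.0,
    "transform/quant":   220.0,
    "entropy coding":     70.0,
    "other":             180.0,
}

for module, share in complexity_shares(encoder_timings).items():
    print(f"{module:18s} {share:5.1f}%")
```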
SVM based approach for complexity control of HEVC intra coding
The High Efficiency Video Coding (HEVC) standard has been adopted by various video applications in recent years. Because of its high computational demand, controlling the complexity of HEVC is of paramount importance for meeting the varying requirements of many applications, including power-constrained video coding, video streaming, and cloud gaming. Most of the existing complexity control methods are only capable of considering a subset of the decision space, which leads to low coding efficiency. While machine learning methods such as Support Vector Machines (SVMs) can be employed for higher-precision decision making, the current SVM-based techniques for HEVC provide a fixed decision boundary, which results in different coding complexities for different video content. Although this might be suitable for complexity reduction, it is not acceptable for complexity control. This paper proposes an adjustable classification approach for Coding Unit (CU) partitioning, which addresses the mentioned problems of complexity control. First, a novel set of features for fast CU partitioning is designed using image processing techniques. Then, a flexible classification method based on SVM is proposed to model the CU partitioning problem. This approach allows adjusting the performance-complexity trade-off even after the training phase. Using this model, and a novel adaptive thresholding technique, an algorithm is presented to deliver video encoding within the target coding complexity, while maximizing the coding efficiency. Experimental results justify the superiority of this method over the state-of-the-art, with target complexities ranging from 20% to 100%.
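A minimal sketch of the adjustable-boundary idea (the function names and threshold values are hypothetical): instead of a fixed sign test on the classifier score, two tunable thresholds carve out an uncertainty band in which the encoder falls back to a full rate-distortion search, so moving the thresholds trades complexity against accuracy even after training.

```python
# Adjustable three-way decision on a classifier score (sketch).
# score plays the role of an SVM decision-function value; t_prune and
# t_split are knobs that can be moved after training to hit a complexity target.

def cu_decision(score, t_prune, t_split):
    """Return 'prune' (stop partitioning), 'split', or 'full_rdo'."""
    if score <= t_prune:
        return "prune"       # confident: do not split this CU
    if score >= t_split:
        return "split"       # confident: split without checking this depth
    return "full_rdo"        # uncertain band: fall back to exhaustive search

# Widening the band (t_prune low, t_split high) means more full-RDO checks,
# i.e. higher complexity but better efficiency; narrowing it does the opposite.
for s in (-1.2, 0.1, 1.5):
    print(s, cu_decision(s, t_prune=-0.5, t_split=0.5))
```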
Complexity Reduction and Control Techniques for Power-Constrained Video Coding
The ever-increasing demand for multimedia content poses a major challenge for video compression. Today, smartphones are in widespread use, and the high-quality cameras and high-resolution screens on these devices have elevated user expectations for higher-resolution and higher-quality video content. As a result, more High Definition (HD) and Ultra High Definition (UHD) content is being shared by users. High Efficiency Video Coding (HEVC/H.265) was introduced by ITU and MPEG as the latest video coding standard, to cope with the high bitrate of this content. It can provide almost twice the coding efficiency of its predecessor, H.264/AVC. However, the encoding process of this standard includes algorithms with a higher computational complexity, due to the newly introduced coding tools and the higher precision of the existing tools. As a consequence, HEVC encoding is power demanding, and thus hard to employ, especially on battery-powered devices. Furthermore, many applications, e.g. video streaming, cloud gaming, and power-constrained video coding, are sensitive to the coding delay (or power), and their tolerable delay (or processing power) varies over time. For this reason, not only is reducing the complexity of video encoding important, but a power-adaptive mechanism is also crucial to enable encoding within a defined power/delay quota.
To address these issues, this thesis investigates the complexity of HEVC encoding and presents novel complexity reduction and complexity control algorithms for it. These contributions can be categorized into three main parts. The first contribution of this thesis considers the high complexity and power consumption of motion estimation in HEVC. Specifically, the baseline encoding algorithm is analyzed from the point of view of memory access, which contributes heavily to the total power consumption of video encoding. A simple yet effective approach is presented that replaces the first step of the search, which finds the best starting point, and also adaptively reduces the search range. This method reduces the memory access and encoding time, with a negligible loss of coding efficiency.
The second contribution of the thesis is fast intra prediction. As the complexity of intra coding has particularly increased in HEVC, more investigation has been dedicated to this module. Two novel methods are proposed that accelerate intra prediction through fast texture analysis. The first approach adopts the filters of a Dual-Tree Complex Wavelet Transform (DT-CWT) to estimate the texture direction of each intra block. The second approach exploits the potential of internal tools integrated in the HEVC engine, i.e. the planar filter and the entropy engine, to prune unnecessary computations.
The third contribution of this work is a machine-learning-driven approach for controlling the complexity of HEVC encoding through fast Coding Unit (CU) partitioning. To this end, a feature set is designed for CU partitioning, which uses the DT-CWT for advanced texture analysis. Then, an adaptive classification approach is presented that decides the termination or skipping of each CU depth. The performance-complexity trade-off of this classification scheme can be adjusted through a set of thresholds. Finally, the complexity control problem is modeled and solved as a constrained optimization problem, where the loss of coding efficiency is minimized while the coding complexity is set to meet the target level.
The effectiveness of the proposed algorithms is verified through extensive experiments on the video dataset suggested in the common test conditions of the standardization committee.
A low complexity and computationally scalable fast motion estimation algorithm for HEVC
Motion Estimation (ME) is one of the most computationally demanding parts of video encoders. The Test Zone (TZ) search is a popular fast ME algorithm, which is recommended for High Efficiency Video Coding (HEVC). While the TZ search achieves excellent coding efficiency, it is not a favorable choice for hardware implementations due to 1) its relatively high computational complexity, 2) the data dependency it induces among neighboring blocks, which complicates hardware implementations and parallel processing in software implementations, and 3) its lack of computational adjustability, which is required for video encoding on power-constrained devices. This paper traces the cause of these issues to the multiple starting search points of the TZ search algorithm. Accordingly, a method is proposed to find a single reliable starting point that replaces the first step of the TZ search algorithm. To do so, both the current and reference frames are analyzed using a complex wavelet transform, and similar salient points are identified in the two frames. Then a lightweight process is used to match these points and find a single reliable starting point. The reliability of this point allows a reduced zonal refinement range at negligible cost in compression efficiency. Since adjusting the refinement range is an effective way of adjusting the complexity, this results in a computationally scalable ME algorithm, named FMECWT. In contrast to the existing methods, FMECWT does not rely on neighboring blocks, which eliminates the inherent data dependency of the TZ search. Experimental results show that FMECWT achieves ~35% to ~85% ME time reduction compared to the TZ search, with only a 0.1% to 1.7% increase in BD-rate.
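The single-starting-point idea can be sketched as follows. The 1-D SAD search and the toy "frames" are stand-ins (the actual method derives the start point from complex-wavelet salient-point matching, which is not reproduced here), but they show how a reliable start plus a tunable refinement range yields a computationally scalable search.

```python
# Block-matching refinement around a single starting point (sketch).
# refine_range is the scalability knob: a smaller range means fewer SAD
# evaluations and hence lower complexity.

def sad(block_a, block_b):
    """Sum of absolute differences between two equal-length blocks."""
    return sum(abs(a - b) for a, b in zip(block_a, block_b))

def refine(cur_block, ref_blocks, start, refine_range):
    """Search ref_blocks at offsets within +/- refine_range of start."""
    best_off, best_cost = start, sad(cur_block, ref_blocks[start])
    for off in range(max(0, start - refine_range),
                     min(len(ref_blocks), start + refine_range + 1)):
        cost = sad(cur_block, ref_blocks[off])
        if cost < best_cost:
            best_off, best_cost = off, cost
    return best_off, best_cost

# Toy 1-D reference "frame": candidate blocks indexed by offset.
ref_blocks = {i: [i, i + 1, i + 2, i + 3] for i in range(10)}
cur_block = [6, 7, 8, 9]
print(refine(cur_block, ref_blocks, start=5, refine_range=2))  # -> (6, 0)
```

Because each block refines only around its own start point, no result from a neighboring block is needed, which mirrors the data-independence property claimed above.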
BLINC: Lightweight Bimodal Learning for Low-Complexity VVC Intra Coding
The latest video coding standard, Versatile Video Coding (VVC), achieves almost twice the coding efficiency of its predecessor, High Efficiency Video Coding (HEVC). However, achieving this efficiency (for intra coding) requires 31x the computational complexity of HEVC, making it challenging for low-power and real-time applications. This paper proposes a novel machine learning approach that jointly and separately employs two modalities of features to simplify the intra coding decision. First, a set of features is extracted using the existing DCT core of VVC to assess the texture characteristics; these form the first modality of data and provide high-quality features with almost no overhead. The distribution of intra modes at the neighboring blocks is also used to form the second modality of data, which provides statistical information about the frame. Second, a two-step feature reduction method is designed that reduces the size of the feature set, such that a lightweight model with a limited number of parameters can be used to learn the intra mode decision task. Third, three separate training strategies are proposed: (1) an offline training strategy using the first (single) modality of data, (2) an online training strategy that uses the second (single) modality, and (3) a mixed online-offline strategy that uses bimodal learning. Finally, a low-complexity encoding algorithm is proposed based on the proposed learning strategies. Extensive experimental results show that the proposed methods can reduce the encoding time by up to 24%, with a negligible loss of coding efficiency. Moreover, it is demonstrated how a bimodal learning strategy can boost the performance of learning. Lastly, the proposed method has a very low computational overhead (0.2%) and uses existing components of a VVC encoder, which makes it much more practical than competing solutions.
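As a loose illustration of combining two feature modalities and shrinking the joint set, the sketch below concatenates two made-up feature vectors and keeps only the highest-variance features. Variance-based selection is a simple stand-in for the paper's two-step reduction, whose details are not given in the abstract.

```python
# Concatenate two feature modalities and keep only the k highest-variance
# features (a simple stand-in for a learned two-step reduction).
from statistics import pvariance

def reduce_features(samples, k):
    """samples: list of feature vectors; return the indices of the k
    features with the highest variance across the dataset."""
    n_feat = len(samples[0])
    variances = [pvariance([s[i] for s in samples]) for i in range(n_feat)]
    keep = sorted(range(n_feat), key=lambda i: variances[i], reverse=True)[:k]
    return sorted(keep)

# Modality 1: DCT-based texture features; modality 2: neighbor intra-mode
# histogram (both entirely made up here). Each sample concatenates the two.
samples = [
    [0.9, 0.1, 5.0] + [1, 0, 2],
    [0.8, 0.1, 1.0] + [1, 3, 2],
    [0.7, 0.1, 9.0] + [1, 0, 2],
]
print(reduce_features(samples, k=2))  # -> [2, 4]
```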
Fine-grain complexity control of HEVC intra prediction in battery-powered video codecs
The High Efficiency Video Coding (HEVC) standard improves coding efficiency at the cost of a significantly more complex encoding process. This is an issue for the large number of video-capable devices that operate on batteries, with limited and varying processing power. A complexity controller enables an encoder to provide the best possible quality at any power quota. This paper proposes a complexity control method for HEVC intra coding, based on a Pareto-efficient rate–distortion–complexity (R–D–C) analysis. The proposed method limits the intra prediction for each block (as opposed to existing methods, which limit the block partitioning) on a frame-level basis. This method consists of three steps, namely rate-complexity modeling, complexity allocation, and configuration selection. In the first step, a rate-complexity model is presented which estimates the encoding complexity according to the compression intensity. Then, according to the estimated complexity and the target complexity, a complexity budget is allocated to each frame. Finally, an encoding configuration offering the best compression performance is selected from a set of Pareto-efficient configurations, according to the allocated complexity and the video content. Experimental results indicate that the proposed method can adjust the complexity from 100% down to 50%, with a mean error rate of less than 0.1%. The proposed method outperforms many state-of-the-art approaches in terms of both control accuracy and compression efficiency. The encoding performance loss in terms of BD-rate varies from 0.06% to 3.69% on average, for 90% down to 60% computational complexity, respectively. The method can also be used below 50% complexity if need be, at a higher BD-rate.
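The allocate-then-select part of the three-step pipeline might be sketched like this. The configurations and their costs are hypothetical round numbers; only the selection logic (pick the most efficient Pareto configuration that fits each frame's budget) is illustrated.

```python
# Per-frame selection of the Pareto-efficient encoding configuration with
# the best compression that fits the allocated complexity budget (sketch;
# all configuration numbers are hypothetical).

# (complexity as a fraction of full encoding, BD-rate loss in %),
# sorted by increasing complexity; a lower BD-rate loss is better.
PARETO_CONFIGS = [
    (0.50, 4.0),
    (0.60, 2.2),
    (0.75, 1.0),
    (0.90, 0.1),
    (1.00, 0.0),
]

def select_config(budget):
    """Most efficient (lowest BD-rate loss) config within the budget."""
    feasible = [c for c in PARETO_CONFIGS if c[0] <= budget]
    return min(feasible, key=lambda c: c[1]) if feasible else PARETO_CONFIGS[0]

def allocate(target, n_frames):
    """Even per-frame budget split (a real allocator would be model-driven)."""
    return [target] * n_frames

for budget in allocate(0.80, 3):
    print(budget, select_config(budget))  # -> 0.8 (0.75, 1.0)
```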
Fast Motion Estimation Algorithm with Efficient Memory Access for HEVC Hardware Encoders
The encoding process in the HEVC standard is several times more complex than in previous standards. Since motion estimation is responsible for most of this complexity, the new Test Zone (TZ) search is usually adopted as the fast search algorithm to alleviate it. However, the TZ search requires a high rate of access to the off-chip memory, which contributes heavily to the total consumed encoding power. In this paper we demonstrate that the process of finding the best starting search point in this algorithm does not allow an effective reduction of memory access in hardware encoders. As a solution, a new fast motion estimation algorithm is proposed which estimates a proper single starting search point, in addition to an adaptively reduced search range, based on the available information from the coded neighboring blocks. The experimental results show that, on average, this algorithm can reduce the memory access required for ME by ~78% and reduce the integer ME time by ~70%, with only a 1.1% increase in Bjontegaard Delta (BD) rate.
NCOD: Near-Optimum Video Compression for Object Detection
Efficacy and Safety of MLC601 in the Treatment of Mild Cognitive Impairment: A Pilot, Randomized, Double-Blind, Placebo-Controlled Study
Background and Aim: Mild cognitive impairment (MCI) is characterized by a decline in cognitive function greater than that expected for a person’s age. The clinical significance of this condition is its possible progression to dementia. MLC601 is a natural neuroprotective medication that has shown promising effects in Alzheimer disease. Accordingly, we conducted this randomized, double-blind, placebo-controlled study to evaluate the efficacy and safety of MLC601 in MCI patients. Methods: Seventy-two patients with a diagnosis of MCI were recruited. The included participants were randomly assigned to receive either MLC601 or placebo. An evaluation of global cognitive function was performed at baseline as well as at 3-month and 6-month follow-up visits. Global cognitive function was assessed by Mini-Mental State Examination (MMSE) and Alzheimer’s Disease Assessment Scale-cognitive subscale (ADAS-cog) scores. Efficacy was evaluated by comparing global function scores between the 2 groups during the study period. Safety assessment included adverse events (AEs) and abnormal laboratory results. Results: Seventy patients completed the study, 34 in the MLC601 group and 36 in the placebo group. The mean changes (±SD) in cognition scores over 6 months in the MLC601 group were –2.26 (±3.42) for the MMSE and 3.82 (±6.16) for the ADAS-cog; in the placebo group, they were –2.66 (±3.43) for the MMSE and 4.41 (±6.66) for the ADAS-cog. The differences in cognition changes between the MLC601 and placebo groups, based on both MMSE and ADAS-cog scores, were statistically significant (p < 0.001). Only 5 patients (14.7%) in the MLC601 group reported minor AEs, the most common of which were gastrointestinal; none of them led to patient withdrawal. Conclusion: MLC601 has shown promising efficacy and an acceptable AE profile in MCI patients.